Smart Paper Notes

Semi-automatic tuning of coupled climate models with multiple intrinsic timescales: lessons learned from the Lorenz96 model

Redouane Lguensat, Julie Deshayes, Homer Durand, V. Balaji

Category: Machine Learning

2022-08-11

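This study evaluates the potential of history matching (HM) for tuning a climate system with multi-scale dynamics. Using a toy climate model, the two-scale Lorenz96 model, and running experiments in a perfect-model setting, we explore in detail how several built-in choices need to be carefully tested. We also demonstrate the importance of introducing physical expertise into the parameter ranges prior to running HM. Finally, we revisit a classical procedure in climate model tuning, which consists of tuning the slow and fast components separately. By doing so in the Lorenz96 model, we illustrate the non-uniqueness of plausible parameters and highlight the specificity of metrics that emerge from the coupling. This paper also helps bridge the uncertainty quantification, machine learning, and climate modeling communities by connecting the terms each community uses for the same concept and by presenting promising avenues for collaboration that would benefit climate modeling research.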
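
For readers unfamiliar with the toy model named in this abstract, the following is a minimal sketch of the standard two-scale Lorenz96 tendencies with a Runge-Kutta step. It is not the authors' code: the function names, the flat layout of the fast variables, and the default values F=10, h=1, b=10, c=10 are assumptions chosen for illustration.

```python
def l96_two_scale_tendencies(X, Y, F=10.0, h=1.0, b=10.0, c=10.0):
    """Return (dX/dt, dY/dt) for the two-scale Lorenz96 system.

    X is a list of K slow variables. Y is a flat list of K*J fast
    variables, where Y[k*J:(k+1)*J] is coupled to X[k]. All indices
    are cyclic. Parameter defaults are illustrative assumptions.
    """
    K = len(X)
    J = len(Y) // K
    N = K * J
    coupling = h * c / b

    dX = []
    for k in range(K):
        fast_sum = sum(Y[k * J:(k + 1) * J])
        advection = -X[k - 1] * (X[k - 2] - X[(k + 1) % K])
        dX.append(advection - X[k] + F - coupling * fast_sum)

    dY = []
    for n in range(N):
        advection = -c * b * Y[(n + 1) % N] * (Y[(n + 2) % N] - Y[n - 1])
        dY.append(advection - c * Y[n] + coupling * X[n // J])
    return dX, dY


def rk4_step(X, Y, dt, **params):
    """Advance (X, Y) by one classical fourth-order Runge-Kutta step."""
    def shifted(state, deriv, scale):
        return [s + scale * d for s, d in zip(state, deriv)]

    def combine(state, k1, k2, k3, k4):
        return [s + dt / 6.0 * (a + 2 * p + 2 * q + r)
                for s, a, p, q, r in zip(state, k1, k2, k3, k4)]

    f = l96_two_scale_tendencies
    k1x, k1y = f(X, Y, **params)
    k2x, k2y = f(shifted(X, k1x, dt / 2), shifted(Y, k1y, dt / 2), **params)
    k3x, k3y = f(shifted(X, k2x, dt / 2), shifted(Y, k2y, dt / 2), **params)
    k4x, k4y = f(shifted(X, k3x, dt), shifted(Y, k3y, dt), **params)
    return combine(X, k1x, k2x, k3x, k4x), combine(Y, k1y, k2y, k3y, k4y)
```

In a tuning experiment of the kind the abstract describes, parameters such as F, h, b, and c would be the quantities being calibrated, while statistics of long trajectories would serve as the target metrics. Setting h=0 decouples the two scales, which is one way to picture tuning the slow and fast components separately.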

Deep Learning Models for River Classification at Sub-Meter Resolutions from Multispectral and Panchromatic Commercial Satellite Imagery

Joachim Moortgat, Ziwei Li, Michael Durand, Ian Howat, Bidhyananda Yadav, Chunli Dai

Category: Computer Vision | Machine Learning

2022-12-27

Remote sensing of the Earth's surface water is critical in a wide range of environmental studies, from evaluating the societal impacts of seasonal droughts and floods to the large-scale implications of climate change. Consequently, a large literature exists on the classification of water from satellite imagery. Yet, previous methods have been limited by 1) the spatial resolution of public satellite imagery, 2) classification schemes that operate at the pixel level, and 3) the need for multiple spectral bands. We advance the state of the art by 1) using commercial imagery with panchromatic and multispectral resolutions of 30 cm and 1.2 m, respectively, 2) developing multiple fully convolutional neural networks (FCNs) that can learn the morphological features of water bodies in addition to their spectral properties, and 3) training FCNs that can classify water even from panchromatic imagery. This study focuses on rivers in the Arctic, using images from the Quickbird, WorldView, and GeoEye satellites. Because no training data are available at such high resolutions, we construct those manually. First, we use the RGB and NIR bands of the 8-band multispectral sensors. Those trained models all achieve excellent precision and recall of over 90% on validation data, aided by on-the-fly preprocessing of the training data specific to satellite imagery. In a novel approach, we then use results from the multispectral model to generate training data for FCNs that only require panchromatic imagery, of which considerably more is available. Despite the smaller feature space, these models still achieve a precision and recall of over 85%. We provide our open-source codes and trained model parameters to the remote sensing community, which paves the way to a wide range of environmental hydrology applications at vastly superior accuracies and 2 orders of magnitude higher spatial resolution than previously possible.

Neural Bandits for Data Mining: Searching for Dangerous Polypharmacy

Alexandre Larouche, Audrey Durand, Richard Khoury, Caroline Sirois

Category: Machine Learning

2022-12-10

Polypharmacy, most often defined as the simultaneous consumption of five or more drugs, is a prevalent phenomenon in the older population. Some of these polypharmacies, deemed inappropriate, may be associated with adverse health outcomes such as death or hospitalization. Considering the combinatorial nature of the problem, the size of claims databases, and the cost of computing an exact association measure for a given drug combination, it is impossible to investigate every possible combination of drugs. Therefore, we propose to optimize the search for potentially inappropriate polypharmacies (PIPs). To this end, we propose the OptimNeuralTS strategy, based on Neural Thompson Sampling and differential evolution, to efficiently mine claims datasets and build a predictive model of the association between drug combinations and health outcomes. We benchmark our method using two datasets generated by an internally developed simulator of polypharmacy data containing 500 drugs and 100 000 distinct combinations. Empirically, our method can detect up to 33% of PIPs while maintaining an average precision score of 99% using 10 000 time steps.
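
To give a feel for the Thompson Sampling idea this abstract builds on, here is a minimal sketch of the classical Beta-Bernoulli variant on a small set of arms. This is a deliberately simpler stand-in, not OptimNeuralTS: there is no neural network and no differential evolution, and the function name, the arm reward probabilities, and the seed are all assumptions for illustration.

```python
import random


def thompson_sampling(reward_probs, n_steps, seed=0):
    """Run Beta-Bernoulli Thompson Sampling and return pulls per arm.

    reward_probs holds the (hidden) success probability of each arm;
    it is used only to simulate rewards, never by the policy itself.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    successes = [1] * n_arms  # Beta(1, 1) uniform prior on every arm
    failures = [1] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_steps):
        # Draw one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(successes[a], failures[a])
                   for a in range(n_arms)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < reward_probs[arm] else 0
        # Update the posterior of the arm that was played.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls
```

Calling `thompson_sampling([0.1, 0.2, 0.8], 2000)` concentrates most pulls on the last arm. In the setting of the abstract, an arm would correspond to a drug combination, and the simple posterior would be replaced by a learned model, which is where this sketch stops being representative.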

Training a Vision Transformer from scratch in less than 24 hours with 1 GPU

Saghar Irandoust, Thibaut Durand, Yunduz Rakhmangulova, Wenjie Zi, Hossein Hajimirsadeghi

Category: Computer Vision

2022-11-09

Transformers have become central to recent advances in computer vision. However, training a vision Transformer (ViT) model from scratch can be resource intensive and time consuming. In this paper, we aim to explore approaches to reduce the training costs of ViT models. We introduce some algorithmic improvements to enable training a ViT model from scratch with limited hardware (1 GPU) and time (24 hours) resources. First, we propose an efficient approach to add locality to the ViT architecture. Second, we develop a new image size curriculum learning strategy, which reduces the number of patches extracted from each image at the beginning of training. Finally, we propose a new variant of the popular ImageNet1k benchmark by adding hardware and time constraints. We evaluate our contributions on this benchmark, and show they can significantly improve performance given the proposed training budget. We will share the code at https://github.com/BorealisAI/efficient-vit-training.
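
The link between image size and patch count mentioned above can be made concrete with a little arithmetic. The sketch below assumes square images, a patch size of 16, and a linear ramp from 128 to 224 pixels; these numbers and both function names are illustrative assumptions, and the paper's actual curriculum may differ.

```python
def num_patches(image_size, patch_size=16):
    """Number of non-overlapping patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    return (image_size // patch_size) ** 2


def image_size_schedule(epoch, total_epochs, start_size=128, end_size=224,
                        patch_size=16):
    """Hypothetical linear image-size ramp, rounded down to a patch multiple."""
    frac = epoch / max(total_epochs - 1, 1)
    size = start_size + frac * (end_size - start_size)
    return int(size) // patch_size * patch_size


if __name__ == "__main__":
    for epoch in range(10):
        size = image_size_schedule(epoch, 10)
        print(epoch, size, num_patches(size))
```

Because self-attention cost grows with the number of tokens, starting at 128 pixels (64 patches) instead of 224 pixels (196 patches) makes early epochs considerably cheaper under these assumptions.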

Gemino: Practical and Robust Neural Compression for Video Conferencing

Vibhaalakshmi Sivaraman, Pantea Karimi, Vedantha Venkatapathy, Mehrdad Khani, Sadjad Fouladi, Mohammad Alizadeh, Frédo Durand, Vivienne Sze

Category: Computer Vision

2022-09-21

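Video conferencing systems suffer from poor user experience when network conditions deteriorate, because current video codecs simply cannot operate at extremely low bitrates. Recently, several neural alternatives have been proposed that reconstruct talking-head videos at very low bitrates using sparse representations of each frame, such as facial landmark information. However, these approaches produce poor reconstructions in scenarios with major movement or occlusions during a call, and they do not scale to higher resolutions. We design Gemino, a new neural compression system for video conferencing based on a novel high-frequency-conditional super-resolution pipeline. Gemino upsamples a very low-resolution version of each target frame while enhancing high-frequency details (e.g., skin texture, hair) based on information extracted from a single high-resolution reference image. We use a multi-scale architecture that runs different components of the model at different resolutions, allowing it to scale to resolutions comparable to 720p, and we personalize the model to learn specific details of each person, achieving much better fidelity at low bitrates. We implement Gemino atop aiortc, an open-source Python implementation of WebRTC, and show that it operates on 1024x1024 videos in real time on an A100 GPU and achieves a lower bitrate than traditional video codecs at the same perceptual quality.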

Can Shadows Reveal Biometric Information?

Safa C. Medin, Amir Weiss, Frédo Durand, William T. Freeman, Gregory W. Wornell

Category: Computer Vision | Machine Learning

2022-09-21

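We study the problem of extracting biometric information about individuals by looking at the shadows of objects cast on diffuse surfaces. Using a maximum likelihood analysis, we show that the biometric information leakage from shadows can be sufficient for reliable identity inference under representative scenarios. We then develop a learning-based method that demonstrates this phenomenon in real settings, exploiting the subtle cues in the shadows that are the source of the leakage without requiring any labeled real data. In particular, our approach relies on building synthetic scenes composed of 3D face models obtained from a single photograph of each identity. We transfer what we learn from the synthetic data to the real data in a completely unsupervised way. Our model generalizes well to the real domain and is robust to several variations in the scenes. We report high classification accuracy on an identity classification task that takes place in a scene with unknown geometry and occluding objects.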

Seeing 3D Objects in a Single Image via Self-Supervised Static-Dynamic Disentanglement

Prafull Sharma, Ayush Tewari, Yilun Du, Sergey Zakharov, Rares Ambrus, Adrien Gaidon, William T. Freeman, Fredo Durand, Joshua B. Tenenbaum, Vincent Sitzmann

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-07-22

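Human perception reliably identifies the movable and immovable parts of 3D scenes and completes the 3D structure of objects and background from incomplete observations. We learn this skill not from labeled examples, but simply by observing objects move. In this work, we propose an approach that observes unlabeled multi-view videos at training time and learns to map a single image observation of a complex scene, such as a street with cars, to a 3D neural scene representation that is disentangled into movable and immovable parts while plausibly completing its 3D structure. We separately parameterize the movable and immovable scene parts via 2D neural ground plans. These ground plans are 2D grids aligned with the ground plane that can be locally decoded into 3D neural radiance fields. Our model is trained in a self-supervised manner via neural rendering. We demonstrate that the structure of our disentangled 3D representation enables a variety of downstream tasks in street-scale 3D scenes using simple heuristics, such as extraction of object-centric 3D representations, novel view synthesis, instance segmentation, and 3D bounding box prediction, highlighting its value as a backbone for data-efficient 3D scene understanding models. This disentanglement further enables scene editing via object manipulation such as deletion, insertion, and rigid-body motion.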

Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning

Homer Walke, Jonathan Yang, Albert Yu, Aviral Kumar, Jedrzej Orbik, Avi Singh, Sergey Levine

Category: Robotics | Machine Learning

2022-07-11

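Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems. In practice, however, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment. Moreover, robotic policies learned with RL often fail when deployed beyond the setting in which they were learned. In this work, we study how these challenges can be tackled by effectively using diverse offline datasets collected from previously seen tasks. When faced with a new task, our system adapts previously learned skills to quickly learn both to perform the new task and to return the environment to an initial state, effectively performing its own environment reset. Our empirical results demonstrate that incorporating prior data into robotic reinforcement learning enables autonomous learning, substantially improves the sample efficiency of learning, and leads to better generalization.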

Differentiable Rendering of Neural SDFs through Reparameterization

Sai Praveen Bangaru, Michaël Gharbi, Tzu-Mao Li, Fujun Luan, Kalyan Sunkavalli, Miloš Hašan, Sai Bi, Zexiang Xu, Gilbert Bernstein, Frédo Durand

Category: Computer Vision

2022-06-10

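We present a method to automatically compute correct gradients with respect to geometric scene parameters in neural SDF renderers. Recent physically based differentiable rendering techniques for meshes have used edge sampling to handle discontinuities, particularly at object silhouettes, but SDFs do not have a simple parametric form amenable to sampling. Instead, our approach builds on area-sampling techniques and develops a continuous warping function for SDFs to account for these discontinuities. Our method leverages the distance to the surface encoded in an SDF and uses quadrature on sphere tracer points to compute this warping function. We further show that subsampling the points makes the method tractable for neural SDFs. Our differentiable renderer can be used to optimize neural shapes from multi-view images and produces 3D reconstructions comparable to recent SDF-based inverse rendering methods, without the need for 2D segmentation masks to guide the geometry optimization and without volumetric approximations to the geometry.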
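
The "sphere tracer points" mentioned above come from sphere tracing, the standard way to intersect a ray with an SDF. The following minimal sketch shows only that forward step on an analytic sphere; it does not implement the paper's warping function or any gradient computation, and the function names, tolerances, and step limits are assumptions for illustration.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius


def sphere_trace(sdf, origin, direction, max_steps=128, eps=1e-6, t_max=100.0):
    """March along a ray, stepping by the SDF value at each point.

    Returns (t_hit, points), where t_hit is the ray parameter of the
    surface hit (or None on a miss) and points lists every visited
    sample, i.e. the sphere tracer points.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    direction = tuple(d / norm for d in direction)
    t = 0.0
    points = []
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        points.append(p)
        dist = sdf(p)
        if dist < eps:
            return t, points  # close enough to the surface
        t += dist  # safe step: no surface lies within dist of p
        if t > t_max:
            break
    return None, points
```

For a ray starting at (0, 0, -3) and pointing along +z, the tracer reaches the unit sphere at t = 2. The intermediate points it visits are the kind of samples over which the abstract's quadrature would be evaluated.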

Learning Finite Linear Temporal Logic Specifications with a Specialized Neural Operator

Homer Walke, Daniel Ritter, Carl Trimbach, Michael Littman

Category: Artificial Intelligence

2021-11-07

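Finite linear temporal logic ($\mathsf{LTL}_f$) is a powerful formal representation for modeling temporal sequences. We address the problem of learning a compact $\mathsf{LTL}_f$ formula from labeled traces of system behavior. We propose a novel neural network operator and evaluate the resulting architecture, Neural$\mathsf{LTL}_f$. Our approach includes a specialized recurrent filter, designed to subsume $\mathsf{LTL}_f$ temporal operators, to learn a highly accurate classifier for traces. It then discretizes the activations and extracts the truth table represented by the learned weights. This truth table is converted to symbolic form and returned as the learned formula. Experiments on randomly generated $\mathsf{LTL}_f$ formulas show that Neural$\mathsf{LTL}_f$ scales to larger formula sizes than existing approaches and maintains high accuracy even in the presence of noise.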
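
To make concrete what it means for a finite trace to satisfy an $\mathsf{LTL}_f$ formula, here is a minimal sketch of an evaluator for the usual finite-trace semantics. It covers only the semantics, not the neural learning method of the abstract; the tuple encoding of formulas and the name `holds` are assumptions for illustration.

```python
def holds(formula, trace, i=0):
    """Check whether an LTLf formula holds at position i of a finite trace.

    trace is a non-empty list of sets, each holding the atoms true at
    that step. Formulas are nested tuples such as ("until", f, g).
    """
    op = formula[0]
    if op == "atom":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "or":
        return holds(formula[1], trace, i) or holds(formula[2], trace, i)
    if op == "next":
        # Strong next: a successor position must exist in the trace.
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "until":
        # formula[2] holds at some j >= i, and formula[1] holds on [i, j).
        return any(
            holds(formula[2], trace, j)
            and all(holds(formula[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op!r}")
```

A learner of the kind described above receives traces labeled by such a check and must recover a formula that reproduces the labels. For example, on the trace `[{"a"}, {"a"}, {"b"}]` the formula `("until", ("atom", "a"), ("atom", "b"))` holds, while `("always", ("atom", "a"))` does not.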
